Digitale Zaubertinte: Steganography workshop

Authors: Martin Beneš, Verena Lachner

github.com/uibk-uncover/mip-stego-demo

Let's play a coin game!


$1000$ people each toss a coin $100$ times and count the heads.

In [4]:
import numpy as np
toss = np.random.choice([0, 1], size=[1000, 100])  # toss
toss_sums = np.sum(toss, axis=1)  # count heads
In [8]:
import seaborn as sns
sns.histplot(toss_sums, discrete=True);  # plot histogram

A histogram lets us inspect the distribution of the data graphically.
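To see what shape to expect, the empirical histogram can be set against the binomial distribution. The following is a minimal sketch, assuming a fair coin and a fixed seed chosen here only for reproducibility; the variable names are illustrative and not part of the workshop code.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed (assumption, for reproducibility only)
n_people, n_tosses = 1000, 100

# each row is one person; count the heads per person
heads = rng.integers(0, 2, size=(n_people, n_tosses)).sum(axis=1)

# empirical histogram over all possible counts 0..100
observed = np.bincount(heads, minlength=n_tosses + 1)

# expected number of people per count under Binomial(100, 0.5)
expected = n_people * stats.binom.pmf(np.arange(n_tosses + 1), n_tosses, 0.5)

# compare the central bins side by side
print(observed[45:56])
print(np.round(expected[45:56], 1))
```

The observed counts scatter around the expected ones; the scatter is what a statistical test has to account for.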

Playing with an unfair coin


Our unfair coin yields heads with probability $52\%$.

In [9]:
toss2 = np.random.choice([0, 1], size=[1000, 100], p=[.48, .52])
toss2_sums = np.sum(toss2, axis=1)
In [10]:
import matplotlib.pyplot as plt
sns.histplot(toss_sums, discrete=True, label='0.5');  # fair
sns.histplot(toss2_sums, discrete=True, label='0.52');  # unfair
plt.legend();

Is the orange histogram the same as the blue one?

We can test it using statistics.

The $\chi^2$ test compares the shapes of two histograms, $h$ and $g$, over $N$ bins.

$$X^2=\sum_{i=1}^{N}\frac{(h_i - g_i)^2}{h_i}$$

In [11]:
# histograms
h = np.histogram(toss_sums, bins=range(100+1))[0]
g = np.histogram(toss2_sums, bins=range(100+1))[0]

# test statistic (skip empty bins of h to avoid division by zero)
test_stat = 0
for i in range(100):
    if h[i] > 0:
        test_stat += (h[i] - g[i])**2 / h[i]
In [12]:
test_stat  # test statistic
Out[12]:
197.7241088701197

We convert the test statistic to a p-value (using the $\chi^2$ distribution).

In [13]:
import scipy.stats
scipy.stats.chi2.sf(test_stat, 100-1)  # significant if <0.05
Out[13]:
1.4826819136900469e-08
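The survival function `sf` gives the probability that a $\chi^2$ variable is at least as large as the statistic. An equivalent view, sketched here with the statistic copied from the output above, compares the statistic against the $5\%$ critical value; the variable names are illustrative.

```python
from scipy import stats

dof = 100 - 1                  # degrees of freedom used in the cell above
test_stat = 197.7241088701197  # value copied from Out[12]

p_value = stats.chi2.sf(test_stat, dof)  # P(X >= test_stat)
critical = stats.chi2.isf(0.05, dof)     # statistic at which the p-value equals 0.05

print(f'p-value: {p_value:.3g}, 5% critical value: {critical:.1f}')
print('statistic exceeds critical value:', test_stat > critical)
```

Both views lead to the same decision: a statistic above the critical value is exactly a p-value below $0.05$.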

The histograms are different. Something's wrong with the coin!
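The loop and the p-value conversion can be folded into one helper. This is a minimal sketch: `chi2_compare` is a hypothetical name, and it assumes one degree of freedom per non-empty reference bin minus one, unlike the fixed `100-1` used above.

```python
import numpy as np
from scipy import stats

def chi2_compare(h, g):
    """Return (statistic, p-value) for histogram g against reference h."""
    h = np.asarray(h, dtype=float)
    g = np.asarray(g, dtype=float)
    keep = h > 0  # skip empty reference bins to avoid division by zero
    stat = np.sum((h[keep] - g[keep]) ** 2 / h[keep])
    dof = keep.sum() - 1  # assumption: non-empty bins minus one
    return stat, stats.chi2.sf(stat, dof)

# identical histograms: statistic 0, nothing to reject
stat_same, p_same = chi2_compare([10, 20, 30, 40], [10, 20, 30, 40])
# reversed histogram: large statistic, small p-value
stat_diff, p_diff = chi2_compare([10, 20, 30, 40], [40, 30, 20, 10])
print(stat_same, p_same)
print(stat_diff, p_diff)
```

With the workshop data it could be called as `chi2_compare(h, g)`.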

Enough of coins, back to images!

In [14]:
# load image as 8-bit grayscale
from PIL import Image
x = np.array(Image.open('../img/nockspitze.png').convert('L'))
plt.imshow(x, cmap='gray');
In [15]:
sns.histplot(x.flatten(), discrete=True);  # cover histogram
In [16]:
sns.histplot(lsbr(x, 1.).flatten(), discrete=True);  # stego histogram

Let's take a closer look. What is going on with the histogram?

In [17]:
fig, ax = plt.subplots(1, 3, sharey=True)
for i, alpha in enumerate([.0, .5, 1.]):
    y = lsbr(x, alpha).flatten()
    sns.histplot(y, binrange=(125, 175), discrete=True, ax=ax[i]);
    ax[i].set_title(f'{alpha:.1f}');

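LSB replacement (LSBr) pulls the two counts of each value pair $(2k, 2k+1)$ toward their average. At full embedding rate, the expected histogram is

$$\bar{h}_{2k}=\bar{h}_{2k+1}=\frac{h_{2k}+h_{2k+1}}{2}$$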
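The pairing follows from how the embedding works: overwriting the least significant bit can only move a pixel between $2k$ and $2k+1$. The sketch below illustrates this with a hypothetical `lsbr_sketch` function; it is not the workshop's `lsbr`, whose implementation is not shown here, and the artificial cover is an assumption chosen to make the effect obvious.

```python
import numpy as np

def lsbr_sketch(x, alpha, seed=0):
    """Hypothetical LSB replacement at embedding rate alpha (0..1)."""
    rng = np.random.default_rng(seed)
    y = x.flatten()  # flatten() returns a copy, so the cover stays untouched
    n_used = int(round(alpha * y.size))
    idx = rng.permutation(y.size)[:n_used]  # pseudo-random embedding positions
    bits = rng.integers(0, 2, size=n_used, dtype=np.uint8)  # random message bits
    y[idx] = (y[idx] & 0xFE) | bits  # clear the LSB, then write the message bit
    return y.reshape(x.shape)

# artificial cover holding only even values (extreme case, not a real image)
rng = np.random.default_rng(1)
cover = (2 * rng.integers(60, 90, size=(256, 256))).astype(np.uint8)
stego = lsbr_sketch(cover, 1.0)

h_cover = np.bincount(cover.ravel(), minlength=256)
h_stego = np.bincount(stego.ravel(), minlength=256)

# counts move only inside each pair (2k, 2k+1), so the pair totals stay fixed
pairs_cover = h_cover[0::2] + h_cover[1::2]
pairs_stego = h_stego[0::2] + h_stego[1::2]
print(np.array_equal(pairs_cover, pairs_stego))
print(h_cover[150:154], h_stego[150:154])
```

In this cover every odd bin starts empty; after full embedding each pair is split roughly in half, while the pair totals are unchanged.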

In [18]:
# histogram
h, edges = np.histogram(x.flatten(), bins=range(256+1))
# average even-odd pairs
hbar = np.repeat((h[:-1:2] + h[1::2]) / 2, 2)
In [19]:
fig, ax = plt.subplots(1, 2, sharey=True)
ax[0].bar(range(256), h);
ax[1].bar(range(256), hbar);

Can we use the $\chi^2$ test to detect steganography?

If the histogram is similar to its pair-averaged version, LSBr steganography is likely present.

$$S=\sum_{i=0}^{255}\frac{(h_i-\bar{h}_i)^2}{\bar{h}_i}$$

In [20]:
# Avoid division by zero
h = h[hbar > 0]
hbar = hbar[hbar > 0]
# Chi2 test
test_stat = np.sum((h - hbar)**2 / hbar)
pvalue = scipy.stats.chi2.sf(test_stat, 2**8-1)
In [21]:
pvalue  # stego if >0.05
Out[21]:
0.0

The p-value is below $5\%$, so the histogram differs from its pair-averaged version. The image is a cover.

Now we run the $\chi^2$ test on a stego image.

In [22]:
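def chi2_attack(x):
    """Return the chi2 p-value; values near 1 suggest LSBr stego."""
    # histogram and its pair-averaged version
    h = np.histogram(x.flatten(), bins=range(256+1))[0]
    hbar = np.repeat((h[:-1:2] + h[1::2])/2, 2)
    # drop empty pairs to avoid division by zero
    h, hbar = h[hbar > 0], hbar[hbar > 0]
    # chi2 statistic over the even bin of each pair
    S = np.sum((h[:-1:2] - hbar[::2])**2 / hbar[::2])
    # degrees of freedom: number of non-empty bins minus one
    return scipy.stats.chi2.sf(S, h.size-1)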
In [23]:
y = lsbr(x, 1.)  # create stego
pvalue = chi2_attack(y)  # chi2 attack
In [24]:
pvalue  # stego if >0.05
Out[24]:
1.0

The p-value is above $5\%$, so the histogram is indistinguishable from its pair-averaged version. The image is stego.

Take-away messages

  • Steganography distorts image statistics.
  • The $\chi^2$ test can detect the presence of LSB replacement steganography.

Hands-on: LSBr

  • Find the image(s) with steganography among the suspicious images
  • Try to extract the messages
    • The secret key is the answer to life, the universe, and everything.
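How a key can enter the extraction is sketched below. This is an assumption for illustration only: the key seeds a pseudo-random permutation of pixel positions, and the message length is passed explicitly. The names `embed_sketch` and `extract_sketch` are hypothetical; the workshop's `extract_lsbr` may work differently.

```python
import numpy as np

def embed_sketch(cover, message, key):
    """Hypothetical embedder: write message bits into key-permuted pixel LSBs."""
    bits = np.unpackbits(np.frombuffer(message.encode('utf-8'), dtype=np.uint8))
    y = cover.flatten()  # copy of the cover pixels
    if bits.size > y.size:
        raise ValueError('message does not fit into the cover')
    # the key seeds the generator, so sender and receiver get the same order
    order = np.random.default_rng(key).permutation(y.size)[:bits.size]
    y[order] = (y[order] & 0xFE) | bits
    return y.reshape(cover.shape)

def extract_sketch(stego, n_bytes, key):
    """Hypothetical extractor: read the LSBs back in the same key-permuted order."""
    y = stego.flatten()
    order = np.random.default_rng(key).permutation(y.size)[:8 * n_bytes]
    bits = y[order] & 1
    return np.packbits(bits).tobytes().decode('utf-8', errors='replace')

# random 8-bit "image" standing in for a real cover (assumption)
cover = np.random.default_rng(0).integers(0, 256, size=(64, 64), dtype=np.uint8)
secret = 'meet by the bike stand'
n_bytes = len(secret.encode('utf-8'))

stego = embed_sketch(cover, secret, key=42)
recovered = extract_sketch(stego, n_bytes, key=42)  # correct key
wrong_key = extract_sketch(stego, n_bytes, key=7)   # wrong key reads other pixels
print(recovered)
print(repr(wrong_key))
```

With the wrong key the extractor visits different pixels and returns unreadable bytes, which is why the key matters in the exercise.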
In [27]:
# load suspicious images
t21a = np.array(Image.open('../img/t21a.png'))
martinswand = np.array(Image.open('../img/martinswand.png'))
statue = np.array(Image.open('../img/statue.png'))

Run the $\chi^2$ test on each image.

In [28]:
chi2_attack(t21a)
Out[28]:
0.0
In [29]:
chi2_attack(statue)
Out[29]:
0.0
In [30]:
chi2_attack(martinswand)
Out[30]:
1.0
In [31]:
key = 42  # answer to life
message = extract_lsbr(martinswand, key=key)
In [32]:
print(message[:300], '...')
1609

THE SONNETS

by William Shakespeare



                     1
  From fairest creatures we desire increase,
  That thereby beauty's rose might never die,
  But as the riper should by time decease,
  His tender heir might bear his memory:
  But thou contracted to thine own bright eyes,
  Feed's ...
In [33]:
print(extract_lsbr(t21a))  # hidden Easter egg
Hi Verena!
Sorry again that you lost the key from your bike.
Let's meet today by the bike stand to cut the lock.

Martin.